The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing.
Microbial ecology is plagued by problems
of an abstract nature. Cell sizes are so
small and population sizes so large that
both are virtually incomprehensible. Niches
are so far from our everyday experience
as to make their very definition elusive.
Organisms that may be abundant and
critical to our survival are little understood,
seldom described and/or cultured,
and sometimes yet to be even seen. One
way to confront these problems is to use
data of an even more abstract nature:
molecular sequence data. Massive environmental
nucleic acid sequencing, such
as metagenomics or metatranscriptomics,
promises functional analysis of microbial
communities as a whole, without prior
knowledge of which organisms are in the
environment or exactly how they are
interacting. But sequence-based ecological
studies nearly always use a comparative
approach, and that requires relevant
reference sequences, which are an extremely
limited resource when it comes to
microbial eukaryotes.
In practice, this means sequence databases
need to be populated with enormous
quantities of data for which we have
some certainties about the source. Most
important is the taxonomic identity of
the organism from which a sequence is
derived and as much functional identification
of the encoded proteins as possible. In
an ideal world, such information would be
available as a large set of complete, well-curated,
and annotated genomes for all the
major organisms from the environment
in question. Reality substantially diverges
from this ideal, but at least for bacterial
molecular ecology, there is a database
consisting of thousands of complete genomes
from a wide range of taxa,
supplemented by a phylogeny-driven approach
to diversifying genomics [2]. For
eukaryotes, the number of available genomes
is far, far smaller, and we have relied
much more heavily on random growth of
sequence databases, raising the
question as to whether this is fit for
purpose